Applications of Weighted Automata in Natural Language Processing

Authors

  • Kevin Knight
  • Jonathan May
Abstract

Linguistics and automata theory were at one time tightly knit. Very early on, finite-state processes were used by Markov [35, 27] to predict sequences of vowels and consonants in novels by Pushkin. Shannon [48] extended this idea to predict letter sequences of English words using Markov processes. While many theorems about finite-state acceptors (FSAs) and finite-state transducers (FSTs) were proven in the 1950s, Chomsky argued that such devices were too simple to adequately describe natural language [6]. Chomsky employed context-free grammars (CFGs) and then introduced the more powerful transformational grammars (TG), loosely defined in [7]. In attempting to formalize TG, automata theorists like Rounds [46] and Thatcher [52] introduced the theory of tree transducers. Computational linguistics also got going in earnest, with Woods’ use of augmented transition networks (ATNs) for automatic natural language parsing. In the final paragraph of his 1973 tree automata survey [53], Thatcher wrote:


Similar resources

Tiburon: A Weighted Tree Automata Toolkit

The availability of weighted finite-state string automata toolkits made possible great advances in natural language processing. However, recent advances in syntax-based NLP model design are unsuitable for these toolkits. To combat this problem, we introduce a weighted finite-state tree automata toolkit, which incorporates recent developments in weighted tree automata theory and is useful for na...


NLP Applications Based on Weighted Multi-Tape Automata

This article describes two practical applications of weighted multi-tape automata (WMTAs) in Natural Language Processing that demonstrate the augmented descriptive power of WMTAs compared to weighted 1-tape and 2-tape automata. The two examples concern the preservation of intermediate results in transduction cascades and the search for similar words in two languages. As a basis for these appli...


Weighted Automata in Text and Speech Processing

Finite-state automata are a very effective tool in natural language processing. However, in a variety of applications and especially in speech processing, it is necessary to consider more general machines in which arcs are assigned weights or costs. We briefly describe some of the main theoretical and algorithmic aspects of these machines. In particular, we describe an efficient composition alg...


Weighted Automata in Text Processing

Mehryar Mohri, Fernando Pereira and Michael Riley, AT&T Research, 600 Mountain Avenue, Murray Hill, NJ 07974. Abstract. Finite-state automata are a very effective tool in natural language processing. However, in a variety of applications and especially in speech processing, it is necessary to consider more general machines in which arcs are assigned ...


Minimizing Deterministic Weighted Tree Automata

The problem of efficiently minimizing deterministic weighted tree automata (wta) is investigated. Such automata have found promising applications as language models in Natural Language Processing. A polynomial-time algorithm is presented that given a deterministic wta over a commutative semifield, of which all operations including the computation of the inverses are polynomial, constructs an eq...


Parsing Algorithms based on Tree Automata

We investigate several algorithms related to the parsing problem for weighted automata, under the assumption that the input is a string rather than a tree. This assumption is motivated by several natural language processing applications. We provide algorithms for the computation of parse-forests, best tree probability, inside probability (called partition function), and prefix probability. Our ...



Publication date: 2007